STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
In recent years, automatic generation of image descriptions (captions), that
is, image captioning, has attracted a great deal of attention. In this paper,
we particularly consider generating Japanese captions for images. Since most
available caption datasets have been constructed for the English language, there
are few datasets for Japanese. To tackle this problem, we construct a
large-scale Japanese image caption dataset based on images from MS-COCO, which
is called STAIR Captions. STAIR Captions consists of 820,310 Japanese captions
for 164,062 images. In the experiment, we show that a neural network trained
using STAIR Captions can generate more natural and better Japanese captions,
compared to those generated using English-Japanese machine translation after
generating English captions.
Comment: Accepted as ACL2017 short paper. 5 page
Ridge Regression, Hubness, and Zero-Shot Learning
This paper discusses the effect of hubness in zero-shot learning, when ridge
regression is used to find a mapping between the example space and the label
space. Contrary to the existing approach, which attempts to find a mapping from
the example space to the label space, we show that mapping labels into the
example space is desirable to suppress the emergence of hubs in the subsequent
nearest neighbor search step. Assuming a simple data model, we prove that the
proposed approach indeed reduces hubness. This was verified empirically on the
tasks of bilingual lexicon extraction and image labeling: hubness was reduced
in both of these tasks, and the accuracy improved accordingly.
Comment: To be presented at ECML/PKDD 201
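The two mapping directions contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NumPy, closed-form ridge regression, and Euclidean nearest-neighbor search, and the function names (`ridge_map`, `predict_forward`, `predict_reverse`) and the regularization strength `lam` are hypothetical.

```python
import numpy as np

def ridge_map(src, dst, lam=1.0):
    """Closed-form ridge solution W minimizing ||src @ W - dst||^2 + lam * ||W||^2."""
    d = src.shape[1]
    return np.linalg.solve(src.T @ src + lam * np.eye(d), src.T @ dst)

def sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def predict_forward(x_train, y_train, x_test, label_vecs, lam=1.0):
    """Existing direction: map examples into label space, then pick the nearest label."""
    w = ridge_map(x_train, y_train, lam)
    return sq_dists(x_test @ w, label_vecs).argmin(axis=1)

def predict_reverse(x_train, y_train, x_test, label_vecs, lam=1.0):
    """Proposed direction: map labels into example space, then pick the nearest mapped label."""
    w = ridge_map(y_train, x_train, lam)
    return sq_dists(x_test, label_vecs @ w).argmin(axis=1)

def n1_counts(predictions, num_labels):
    """How often each label is returned as the nearest neighbor (a simple hubness indicator)."""
    return np.bincount(predictions, minlength=num_labels)
```

Here `y_train` holds the label vector of each training example, and `label_vecs` holds the candidate label vectors searched at test time. Comparing the spread of `n1_counts` for the two predictors on real data is one way to inspect hubness: a few labels with very large counts indicate hubs.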
Learning Decorrelated Representations Efficiently Using Fast Fourier Transform
Barlow Twins and VICReg are self-supervised representation learning models
that use regularizers to decorrelate features. Although these models are as
effective as conventional representation learning models, their training can be
computationally demanding if the dimension d of the projected embeddings is
high. As the regularizers are defined in terms of individual elements of a
cross-correlation or covariance matrix, computing the loss for n samples takes
O(n d^2) time. In this paper, we propose a relaxed decorrelating regularizer
that can be computed in O(n d log d) time by Fast Fourier Transform. We also
propose an inexpensive technique to mitigate undesirable local minima that
develop with the relaxation. The proposed regularizer exhibits accuracy
comparable to that of existing regularizers in downstream tasks, while training
with it requires less memory and is faster for large d. The source code is
available.
Comment: Accepted for CVPR 202
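To see why a Fast Fourier Transform can bring the cost from O(n d^2) down to O(n d log d), consider one possible relaxation: instead of penalizing every element of the d x d cross-correlation matrix, penalize only the sum of each wrapped (circular) diagonal. Those d sums form a circular cross-correlation, which the FFT computes without ever building the matrix. The sketch below assumes NumPy and checks this identity numerically; the function names and the penalty in `relaxed_penalty` are hypothetical and may differ from the regularizer actually proposed in the paper.

```python
import numpy as np

def diagonal_sums_naive(z1, z2):
    """Build C = z1.T @ z2 / n explicitly and sum each wrapped diagonal: O(n d^2)."""
    n, d = z1.shape
    c = z1.T @ z2 / n
    rows = np.arange(d)
    return np.array([c[rows, (rows + k) % d].sum() for k in range(d)])

def diagonal_sums_fft(z1, z2):
    """Same d sums via circular cross-correlation in the frequency domain: O(n d log d)."""
    n, d = z1.shape
    f1 = np.fft.rfft(z1, axis=1)
    f2 = np.fft.rfft(z2, axis=1)
    # Average the per-sample cross-spectra, then transform back once.
    spectrum = (np.conj(f1) * f2).mean(axis=0)
    return np.fft.irfft(spectrum, n=d)

def relaxed_penalty(z1, z2):
    """Hypothetical relaxed penalty: squared sums of all diagonals except the main one."""
    s = diagonal_sums_fft(z1, z2)
    return float(np.sum(s[1:] ** 2))
```

Because only d sums are constrained rather than d^2 individual elements, off-diagonal entries can cancel each other within a diagonal, which is one way such a relaxation can admit undesirable minima.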
Action Class Relation Detection and Classification Across Multiple Video Datasets
The Meta Video Dataset (MetaVD) provides annotated relations between action
classes in major datasets for human action recognition in videos. Although
these annotated relations enable dataset augmentation, the augmentation applies
only to datasets covered by MetaVD. For an external dataset to enjoy the same benefit, the
relations between its action classes and those in MetaVD need to be determined.
To address this issue, we consider two new machine learning tasks: action class
relation detection and classification. We propose a unified model to predict
relations between action classes, using language and visual information
associated with classes. Experimental results show that (i) pre-trained recent
neural network models for texts and videos contribute to high predictive
performance, (ii) the relation prediction based on action label texts is more
accurate than that based on videos, and (iii) a blending approach that combines
predictions by both modalities can further improve the predictive performance
in some cases.
Comment: Accepted to Pattern Recognition Letters. 12 pages, 4 figure
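One simple way to blend predictions from two modalities is a convex combination of their class probabilities. The sketch below assumes NumPy; the `blend` function, the weight of 0.6, and the probability values are all hypothetical, and the abstract does not specify how the authors' blending is actually defined.

```python
import numpy as np

def blend(p_text, p_video, weight=0.5):
    """Convex combination of two class-probability arrays of the same shape."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * np.asarray(p_text) + (1.0 - weight) * np.asarray(p_video)

# Made-up probabilities over three relation types for two action class pairs.
p_text = np.array([[0.7, 0.2, 0.1], [0.2, 0.5, 0.3]])
p_video = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])

blended = blend(p_text, p_video, weight=0.6)
predicted = blended.argmax(axis=1)  # index of the most probable relation per pair
```

In practice the weight would be tuned on held-out data. In this toy input the two modalities disagree on the second pair, and the blended prediction follows the more confident one.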
Sterile protection and transmission blockade by a multistage anti-malarial vaccine in the pre-clinical study.
Peer reviewed: True
The Malaria Vaccine Technology Roadmap 2013 (World Health Organization) aims to develop safe and effective vaccines by 2030 that will offer at least 75% protective efficacy against clinical malaria and reduce parasite transmission. Here, we demonstrate a highly effective multistage vaccine against both the pre-erythrocytic and sexual stages of Plasmodium falciparum that protects and reduces transmission in a murine model. The vaccine is based on a viral-vectored vaccine platform, comprising a highly-attenuated vaccinia virus strain, LC16m8Δ (m8Δ), a genetically stable variant of a licensed and highly effective Japanese smallpox vaccine LC16m8, and an adeno-associated virus (AAV), a viral vector for human gene therapy. The genes encoding P. falciparum circumsporozoite protein (PfCSP) and the ookinete protein P25 (Pfs25) are expressed as a Pfs25-PfCSP fusion protein, and the heterologous m8Δ-prime/AAV-boost immunization regimen in mice provided both 100% protection against PfCSP-transgenic P. berghei sporozoites and up to 100% transmission blocking efficacy, as determined by a direct membrane feeding assay using parasites from P. falciparum-positive, naturally-infected donors from endemic settings. Remarkably, the vaccine-induced immune responses persisted for over 7 months and additionally provided complete protection against repeated parasite challenge in a murine model. We propose that application of the m8Δ/AAV malaria multistage vaccine platform has the potential to contribute to the landmark goals of the malaria vaccine technology roadmap, to achieve life-long sterile protection and high-level transmission blocking efficacy.